1. INTRODUCTION

Glioma is one of the primary brain tumors [1]. It originates from glial cells, which are
neuron-supporting cells that help specify synaptic contacts and preserve the signaling capability of
neurons [2]. Statistically, 15,000 people diagnosed with brain cancer each year are glioma patients. Based on
cell type, gliomas are classified into four types: astrocytoma, ependymoma, oligodendroglioma,
and non-glioma [3]. Each type of tumor has properties that vary with its location and
level of malignancy [4]. Early diagnosis of astrocytoma and oligodendroglioma can be performed
through clinical and radiological examination [5], [6]. Clinical examination ascertains whether
the patient has a brain disorder based on the symptoms the patient experiences, whereas
radiological examination can be conducted using magnetic resonance imaging (MRI) [7].
Brain structure, size, and the various forms of tumors increase the complexity of classifying
gliomas [8], [9]. In addition, diagnostic errors can lead to errors in treatment, so a
method that facilitates the classification of gliomas from medical images is needed.

As a branch of artificial intelligence (AI), machine learning (ML) covers the design and
development of algorithms that enable computers to perform and develop behaviors
based on empirical data such as sensor data or databases [10]. Deep learning (DL) is a more recent and
powerful methodology that addresses weaknesses of conventional ML. The DL approach is a breakthrough in
solving real-world problems [11]. Supported by large computational power and able to handle large,
complex datasets, DL has proven to be superior in image processing, particularly in medical
imaging [12].

Numerous studies have classified medical images using ML approaches. Some
use k-means clustering to find tumors in medical images [13]. In that study, k-means produced
an excellent result in segmenting the brain tumor; the k-means method was modified so that it could segment
data on a large scale. Combined methods have also been applied, such as support vector machine (SVM) with
k-means [14].

Several previous studies also exist in the area of deep neural networks (DNN). In 2017,
Esteva et al. [15] applied a DNN to skin lesion image classification. The system was
validated against real dermatologists on the diagnosis of clinical images in two classification use
cases, and achieved an accuracy of 93.3% [15]. Cho et al. [16] classified
gliomas into low-level and high-level glioma using multi-modal image radiomics features. The test
used 220 images of high-level glioma and 54 images of low-level glioma, and reached an accuracy of
88% [16]. Other research has shown strong brain tumor segmentation results using a deep neural
network [17], a convolutional neural network (CNN) with two-phase training on brain tumor image
segmentation benchmark (BRATS) test data [18], a CNN with 3x3 kernels [19], several novel adjusted CNNs
[20], [21], CNN implementations on 3-dimensional MRI scans [22], [23], and the extreme learning
machine (ELM) [24].

Identifying and classifying astrocytoma, ependymoma, oligodendroglioma, and unspecified glioma
remains an open problem. A standard DL approach is expected to give better results than common methods
such as a feed-forward neural network or SVM. However, the differences between the existing
methodology and the proposed methodology need to be made clear. Previous studies show that DL has
already been applied to glioma tumors, but there are several differences. The main difference lies in the
glioma dataset used in this study, which, to our knowledge, has not been processed with a deep convolutional
neural network (DCNN), and in particular not with a DCNN combined with k-means segmentation, as described
in our proposed method.


2. RESEARCH METHOD 

The classification of glioma brain tumors consists of several steps: pre-processing
of the training and testing data, the training process, the classification process using the DCNN method,
and the system testing process. Figure 1 shows the general architecture of this research.

Figure 1. General architecture

The proposed method consists of several stages. The first stage is the acquisition
of glioma-detected images. The images are then preprocessed: grayscale
conversion; cropping to minimize the black background of the image; contrast
enhancement to obtain more precise data; and segmentation and thresholding to isolate the specific area of
the glioma. The segmentation is performed using k-means clustering. All preprocessed images are divided
into training and testing data. The training process uses the training data to create the
DCNN model, which is then used as the classification engine and evaluated on the testing data.
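
The preprocessing stages listed above can be illustrated with a minimal sketch in plain Python. The function names, the luminance weights used for grayscale conversion, the min-max form of contrast enhancement, and the threshold values are assumptions made for illustration; the paper does not specify which variants were used.

```python
def to_grayscale(rgb):
    """Convert an RGB image (rows of (r, g, b) tuples) to grayscale."""
    # Assumption: standard luminance weights; the paper does not state the formula.
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row] for row in rgb]


def crop_background(gray, floor=10):
    """Crop to the bounding box of pixels brighter than `floor` (hypothetical cut-off)."""
    rows = [y for y, row in enumerate(gray) if any(p > floor for p in row)]
    cols = [x for x in range(len(gray[0])) if any(row[x] > floor for row in gray)]
    if not rows or not cols:
        return gray  # nothing brighter than the floor; leave the image unchanged
    return [row[cols[0]:cols[-1] + 1] for row in gray[rows[0]:rows[-1] + 1]]


def stretch_contrast(gray):
    """Contrast enhancement by min-max stretching to the 0..255 range (assumed variant)."""
    lo = min(min(row) for row in gray)
    hi = max(max(row) for row in gray)
    if hi == lo:
        return [row[:] for row in gray]
    return [[round((p - lo) * 255 / (hi - lo)) for p in row] for row in gray]


def threshold(gray, t=128):
    """Binary threshold: 255 where the pixel is at least t, otherwise 0."""
    return [[255 if p >= t else 0 for p in row] for row in gray]


def preprocess(rgb):
    """Grayscale conversion, cropping, contrast enhancement, then thresholding."""
    return threshold(stretch_contrast(crop_background(to_grayscale(rgb))))


# Tiny synthetic 4x4 image with a black border around a 2x2 region of interest.
black = (0, 0, 0)
image = [
    [black, black, black, black],
    [black, (60, 60, 60), (200, 200, 200), black],
    [black, (80, 80, 80), (120, 120, 120), black],
    [black, black, black, black],
]
print(preprocess(image))
```

In this toy input, cropping removes the black border and only the brightest pixel survives the threshold. Real MRI slices would normally be loaded with an imaging library rather than built by hand.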


2.1. Data 

This research uses benchmark data in the form of images in joint
photographic experts group (JPEG) format, which were used for the training process and are available from the
cancer imaging archive (TCIA) [25]. The images cover three glioma categories, namely astrocytoma,
ependymoma, and oligodendroglioma; several image samples are shown in Figure 2. In addition to these
categories, this study also includes images of brains without gliomas, giving a total of four
classes. Using data from the same source, brains that did not have gliomas were categorized as brains
without tumors. In total, 350 MRI images, converted to JPG format, were used in this study.
The distribution of the dataset can be seen in Table 1.


Table 1. Research data

Dataset         Astrocytoma   Ependymoma   Oligodendroglioma   Normal   Total
Training data   80            80           80                  20       260
Testing data    25            25           25                  15       90
Total           105           105          105                 35       350

Figure 2. Image samples of astrocytoma, ependymoma, and oligodendroglioma glioma tumors [25]


2.2. K-means segmentation process 

The segmentation process using k-means begins by reading the input data, which is the output
file of the Gaussian filter from the previous stage. The width and height of the image are then
obtained, followed by the pixel values of the image. Next, the pixels that will serve as the cluster
centers (centroids) are determined. The number of centroids equals the specified number of
clusters (k).

The Euclidean distance between each pixel and each centroid is then calculated and sorted,
followed by partitioning (grouping) into clusters. Each pixel is assigned to the cluster whose
centroid yields the smallest Euclidean distance. The
clustering is performed on all pixels, so that the pixels are divided into as many groups as the number of
clusters determined in the previous step. After the pixels are grouped, the average pixel value of each
cluster is calculated, and this average becomes the new centroid. With the new centroids,
the distance calculation, the partitioning, and the calculation of
the average pixel value are repeated until the number of iterations specified in the first step is
reached or the convergence condition is met. Convergence is the condition in which
the calculated mean value equals the centroid. When convergence is met or the specified number of iterations has
been carried out, the process stops, and the output is an image whose pixels are grouped into the desired
number of clusters.
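
The iteration described in this subsection can be sketched as follows, for a flattened list of grayscale pixel values. This is an illustrative sketch rather than the implementation used in the study: the function name `kmeans_pixels` and the evenly spaced initial centroids are assumptions, since the text does not state how the initial centroids are chosen.

```python
def kmeans_pixels(pixels, k, max_iter=100):
    """Cluster grayscale pixel values into k groups.

    Returns (centroids, labels), where labels[i] is the cluster index of pixels[i].
    """
    # Assumption: initial centroids are spread evenly over the intensity range.
    lo, hi = min(pixels), max(pixels)
    centroids = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(pixels)

    for _ in range(max_iter):
        # Assignment step: each pixel joins the nearest centroid. For scalar
        # intensities the Euclidean distance is the absolute difference.
        labels = [min(range(k), key=lambda c: abs(p - centroids[c])) for p in pixels]

        # Update step: the mean of each cluster becomes its new centroid.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == c]
            new_centroids.append(sum(members) / len(members) if members else centroids[c])

        # Convergence: the recomputed means equal the current centroids.
        if new_centroids == centroids:
            break
        centroids = new_centroids

    return centroids, labels


# Toy example: three intensity groups (dark, mid, bright).
pixels = [10, 12, 11, 200, 205, 198, 90, 95]
centroids, labels = kmeans_pixels(pixels, k=3)
print(centroids)  # [11.0, 92.5, 201.0]
print(labels)     # [0, 0, 0, 2, 2, 2, 1, 1]
```

For a two-dimensional image, the rows would be flattened into one list before the call and the labels reshaped to the image width and height afterwards.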


2.3. Deep convolutional neural network 

DCNN is a DL-based neural network approach used to analyze visual images and to detect and recognize
objects in an image. Because images are high-dimensional vectors, many parameters are needed to characterize
the network. Essentially, a DCNN is an advanced, deeper version of a CNN. The bionic CNN was proposed
to reduce the pre-processing phase and to adapt the model architecture specifically to visual tasks. A
CNN is usually arranged as a set of layers that can be grouped according to their function. It consists of
neurons that have weights, biases, and activation functions, and its architecture has two main
parts: the feature extraction layers and the fully-connected layers.

The process of classifying glioma brain tumors is carried out in several steps:
processing of the training and testing data, classification using the DCNN method, and finally testing
by the system. At the input, the data are divided into two sets, training and testing data. The data
go through a pre-processing stage consisting of
scaling, contrast enhancement, and thresholding. Training data pass through
the training stage, in which the model learns to classify each image according to its class. Testing data are
processed directly by the model created in the preceding training process, which then
outputs the classification results.


2.4. Parameter setup in environment of deep convolutional neural network 

To find the optimal number of hidden layers, we tuned this parameter over a range
of 2 to 10 hidden layers and selected the value that gave the best result in testing. For
the number of neurons, we use 12 neurons in a fully-connected layer. The
output of each layer is calculated from its input, weights, bias, and activation function. As
activation functions, we use ReLU for the convolutional layers and Softmax on
the output layer to obtain categorical results. As the optimizer for the DCNN,
we use Root Mean Square Propagation (RMSProp). The batch size
determines the number of observations processed before the weights are updated and depends on the computer
specification and setup; we use the default batch size of 32. The epoch is the number of times the learning
process is repeated over the training data. The
greater the number of epochs, the higher the level of learning outcomes, until the model reaches convergence.
The number of epochs used in this study is 100.
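
As an illustration of how a layer output follows from input, weights, bias, and activation function, the sketch below runs a forward pass through a fully-connected layer of 12 neurons and a 4-class Softmax output layer. The input feature vector, the random weights, and the use of ReLU in the fully-connected layer are assumptions for illustration only; it is not the trained model, and the convolutional layers are omitted.

```python
import math
import random


def relu(values):
    """ReLU activation: negative values become zero."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Softmax activation: turns scores into probabilities that sum to 1."""
    m = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, weights, biases, activation):
    """Compute activation(W x + b); `weights` holds one row per neuron."""
    pre = [sum(w * x for w, x in zip(row, inputs)) + b for row, b in zip(weights, biases)]
    return activation(pre)


def random_layer(n_in, n_out, rng):
    """Hypothetical untrained layer: small random weights and zero biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


rng = random.Random(0)
features = [0.2, 0.7, 0.1, 0.9, 0.4, 0.3]      # hypothetical extracted features
w1, b1 = random_layer(len(features), 12, rng)  # 12 fully-connected neurons
w2, b2 = random_layer(12, 4, rng)              # 4 output classes

hidden = dense(features, w1, b1, relu)
probs = dense(hidden, w2, b2, softmax)
print(probs)  # one probability per class
```

The predicted class is the index of the largest probability. During training, an optimizer such as RMSProp would adjust the weights and biases after each batch.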


3. RESULTS AND DISCUSSION 

This section describes our application. As shown in Figure 3, the system is built
as a web-based application for the frontend, with DL in Python on the backend. Figure 3 shows the main user
interface of the proposed system. Table 2 shows the result of each step of the proposed method:
the stages of image processing, starting from the original image, followed by scaling, contrast enhancement, and
thresholding. In this example, the tumor region that becomes the input to the DCNN model
in our system can be seen.

System performance on the testing data is evaluated against the gold standard,
using true/false positives and negatives. A true positive (TP) occurs when the desired output and the actual
output both indicate a detected tumor area. A false positive (FP) occurs when the desired output
is an undetected tumor area while the actual output indicates a detected tumor area. A true negative
(TN) occurs when the desired output and the actual output both indicate an undetected tumor area. A false
negative (FN) occurs when the desired output is a detected tumor area while the actual output indicates
the opposite. The test was performed on a dataset of 90 images of patients with
oligodendroglioma, astrocytoma, and ependymoma brain tumors, and of healthy brains. The results of the test
can be seen in Table 3.
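
The four outcome types defined above can be counted with a short sketch. The function name `confusion_counts` and the example labels are hypothetical and are not taken from the study's data; one class is treated as "positive" and all others as "negative".

```python
def confusion_counts(desired, actual, positive):
    """Count TP, FP, TN, FN for one class treated as 'positive' (one-vs-rest)."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for want, got in zip(desired, actual):
        if got == positive:
            # The system reported the positive class: right (TP) or wrong (FP).
            counts["TP" if want == positive else "FP"] += 1
        else:
            # The system reported something else: a miss (FN) or a correct rejection (TN).
            counts["FN" if want == positive else "TN"] += 1
    return counts


# Hypothetical labels for five images.
desired = ["tumor", "tumor", "none", "none", "tumor"]
actual = ["tumor", "none", "none", "tumor", "tumor"]
print(confusion_counts(desired, actual, positive="tumor"))
# {'TP': 2, 'FP': 1, 'TN': 1, 'FN': 1}
```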

The data used in this study amounted to 350 images covering three classes of glioma brain
cancer plus one class of brains without a tumor/glioma. Because the
amount of data is insufficient, the classification process is not yet perfect. Apart from the lack of data, some
data have similar characteristics and tumor locations, which produced classification errors.
Correct feature extraction is therefore essential when processing the images before classification.


Figure 3. Glioma brain tumor classification GUI


Table 2. Image pre-processing results
(Image table; panels: original image, scaling, contrast enhancement, thresholding)

Based on the testing of our proposed glioma brain tumor classification system, the
average training accuracy is 90%. The training process runs at 184 seconds/epoch,
so it takes around 5 hours 7 minutes to complete the 100 training epochs. In testing, the
glioma classification system using the DCNN achieved 95.5% accuracy.

As shown in Table 3, only 4 images were misclassified. An example of an image
that the system failed to classify can be seen in Figure 4(a). This image shows an
astrocytoma that the system misclassified as an ependymoma. Ependymomas are
located in the ventricular part of the brain, which here lies adjacent to the tumor, and
the image processing used is not precise enough to detect the exact location of the
ventricles and of the tumor. Another factor that disturbs the classification is
that the image processing stage is not yet capable of eliminating unnecessary parts and sometimes
selects the wrong tumor region in the data. Another misclassified image is shown in Figure 4(b):
an astrocytoma that the system misclassified as an oligodendroglioma. This
is because the system was not able to remove the unnecessary parts shown in Figure 4(c) and was not
able to select the correct tumor section shown in Figure 4(d).


Table 3. Testing data results

Predicted \ Actual   Astrocytoma   Ependymoma   Oligodendroglioma   No Tumor
Astrocytoma          23            0            0                   0
Ependymoma           1             24           1                   0
Oligodendroglioma    1             1            24                  0
No Tumor             0             0            0                   15
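
The reported testing accuracy can be reproduced from the counts in Table 3 with the short calculation below. The per-class figures assume that the columns of the table hold the actual classes, an assumption that is consistent with the 25/25/25/15 testing split in Table 1.

```python
labels = ["Astrocytoma", "Ependymoma", "Oligodendroglioma", "No Tumor"]
matrix = [
    [23, 0, 0, 0],
    [1, 24, 1, 0],
    [1, 1, 24, 0],
    [0, 0, 0, 15],
]

total = sum(sum(row) for row in matrix)
correct = sum(matrix[i][i] for i in range(len(matrix)))  # diagonal entries
accuracy = correct / total

# Column sums give the number of test images per class (assuming columns = actual class).
per_class_total = [sum(row[j] for row in matrix) for j in range(len(labels))]
per_class_recall = [matrix[j][j] / per_class_total[j] for j in range(len(labels))]

print(f"correct: {correct}/{total}, accuracy: {accuracy:.4f}")
for name, recall in zip(labels, per_class_recall):
    print(f"{name}: {recall:.2f}")
```

The diagonal sums to 86 of 90 images, so 4 images are misclassified and the accuracy is about 0.9556, which the text reports as 95.5%.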


Figure 4. Images misclassified by the system as (a) ependymoma and (b) oligodendroglioma, with (c) the
unnecessary parts and (d) the part that the system failed to pick up


4. CONCLUSION 

The DCNN method is capable of classifying gliomas from MRI images, with a training accuracy
of 90% using a dataset of 260 images for the training process. The accuracy was affected by the
consistency of the training data within one class of glioma. For the testing process, the system obtained an
accuracy of 95.5% on a total of 90 testing images. For further research, it is strongly encouraged to use more
data to increase accuracy, to apply more image pre-processing methods, and to perform data
visualization in order to obtain the best data for the research.